Limit distribution of degrees in random family trees

Agnes Backhausz 

Department of Probability Theory and Statistics 

Eotvos Lorand University 

Pazmany Peter setany 1/c, H-1117 Budapest, Hungary 

Email: agnes@cs.elte.hu 

Abstract 

In a one-parameter model for evolution of random trees, which also
includes the Barabasi-Albert random tree [1], almost sure behavior and
the limiting distribution of the degree of a vertex in a fixed position are
examined. Results about Polya urn models are applied in the proofs.
1 Introduction

Evolving random graphs and random trees have been widely examined
recently, see e.g. [7]. One of the simplest dynamics is the following. At
each step one new vertex is born, and it attaches with one edge to one of
the old vertices. The probability that a given old vertex is chosen is
proportional to a fixed linear function of its actual degree. The asymptotic
degree distribution is well known. These trees have the so-called scale-free
property: the proportion of vertices of degree $d$ converges to $c_d$ almost
surely as the number of vertices goes to infinity, and
$c_d \sim c \cdot d^{-\gamma}$ ($d \to \infty$) with some positive constants
$c$ and $\gamma$.

Instead of the degree distribution, we focus on the degree of a vertex in
a given position in the tree, as the number of vertices goes to infinity. Fix a
vertex, e.g. the root of the tree, or the $j$th child of the root, or the $k$th
child of the $j$th child of the root, etc. $X_n$ denotes the degree of this
vertex after $n$ steps. We will see that $n^{-\delta} X_n$ converges to a
positive random variable almost surely, with some $\delta > 0$, and we will
have some information on the moments and the structure of this random
variable. We will describe the distribution of these random variables for the
Albert-Barabasi tree, where the probability that a given old vertex is chosen
is proportional to its actual degree. We will also examine a variant, the
generalized PORT model, where the number of children is relevant instead of
the degree.



2 Random trees 

2.1 Notations 

Trees are connected graphs without cycles. We designate one vertex as the
root of the tree. We consider trees growing at discrete time steps. More
precisely, we start from the root, and add one new vertex with one edge at
each step. When examining the neighbors of a certain vertex, we keep track
of the order in which they were born. Thus our graph is a rooted ordered
tree (also known as rooted planar tree or family tree). See for example [3].

We will use the following commonly known terminology and notation for 
rooted ordered trees [3], [7]. The vertices are individuals, and the edges of the
tree represent the parent-child relations. When a new vertex with one edge 
is added to the graph, we say that it becomes the child of its only neighbor, 
its parent. 

We will label the vertices with sequences of positive integers, based on
the parent-child relations. We set

$$\mathbb{Z}_+ = \{1, 2, \dots\}, \qquad \mathbb{Z}_+^0 = \{\emptyset\}, \qquad \mathcal{N} = \bigcup_{n=0}^{\infty} \mathbb{Z}_+^n.$$

The label of the root is $\emptyset$. The $j$th child of the root is labelled
with $j$. Similarly, the $j$th child of the vertex labelled with
$x = (x_1, \dots, x_k) \in \mathcal{N}$ is labelled with $(x_1, \dots, x_k, j)$.
To put it another way, the vertex with label
$x = (x_1, \dots, x_k) \in \mathcal{N}$ is the $x_k$th child of the vertex with
label $(x_1, \dots, x_{k-1})$, which is the $x_{k-1}$th child of its parent,
and so on.

Note that trees can be represented by the set of the labels of their
vertices; the labels give all information about the edges. In the sequel we
identify vertices with their labels, and trees with the set of labels. The set
of finite rooted ordered trees is denoted by $\mathcal{G}$. We say that the
vertex with label $x = (x_1, \dots, x_k) \in \mathcal{N}$ belongs to the $k$th
generation of $G \in \mathcal{G}$. The degree of a vertex $x$ in $G$ will be
denoted by $\deg(x, G)$.
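The labelling convention can be illustrated with a small sketch. It assumes a tree is stored as a Python set of tuples, with the empty tuple standing for the root; the helper names `parent`, `generation` and `degree` are hypothetical and are not taken from the paper.

```python
def parent(label):
    """Parent of a non-root vertex: drop the last coordinate of its label."""
    if not label:
        raise ValueError("the root has no parent")
    return label[:-1]


def generation(label):
    """Generation of a vertex: the length of its label (0 for the root)."""
    return len(label)


def degree(label, tree):
    """Degree of `label` in `tree` (a set of tuple labels).

    The degree is the number of children, plus one edge to the parent
    unless the vertex is the root.
    """
    children = sum(1 for v in tree if v and v[:-1] == label)
    return children + (1 if label else 0)


# A four-vertex example: the root, its two children, and one grandchild.
example_tree = {(), (1,), (2,), (1, 1)}
```

Here `degree((1,), example_tree)` counts one edge to the root and one edge to the child `(1, 1)`.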

2.2 The random tree model with linear weight function

We consider randomly growing trees. At each step we add a new vertex, 
which attaches to a randomly chosen, already existing vertex with one edge. 
The probability that a vertex of degree d gets the new edge is proportional 
to a fixed linear function of d. 

To formulate this, let $\beta > -1$ be fixed, and let
$G_n \in \mathcal{G}$ ($n \in \mathbb{N}$) be a sequence of random finite
rooted ordered trees, such that $G_1 = \{\emptyset\}$, and the following holds
for all $n, k \in \mathbb{N}$, $x = (x_1, \dots, x_k) \in G_n$ and
$d = \deg(x, G_n)$.

$$P\left(G_{n+1} = G_n \cup \{(x_1, \dots, x_k, d)\} \mid G_n\right) = \frac{d + \beta}{S_n},$$

where $S_n = 2n - 2 + n\beta = \sum_{v \in G_n} \left(\deg(v, G_n) + \beta\right)$.
The condition $\beta > -1$ guarantees that the given probabilities are positive.

We say that the weight of a vertex of degree $d$ is $d + \beta$. In other
words, each vertex has weight $1 + \beta$ when it is born, and its weight
increases by 1 every time it gives birth to a child.
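The growth rule can be tried out with a minimal simulation sketch. It tracks only the degrees of the vertices in order of birth (not their labels), starts from the two-vertex tree because the first step is deterministic, and uses the hypothetical names `simulate_degrees` and `seed`, which do not come from the paper.

```python
import random


def simulate_degrees(n, beta, seed=0):
    """Grow a tree to n >= 2 vertices and return the list of degrees.

    Index 0 is the root; index i is the vertex born at step i + 1.
    A vertex of degree d is chosen with probability proportional to d + beta,
    which is positive for beta > -1 since every degree is at least 1 here.
    """
    rng = random.Random(seed)
    degrees = [1, 1]  # the root and its first child, joined by one edge
    while len(degrees) < n:
        weights = [d + beta for d in degrees]
        chosen = rng.choices(range(len(degrees)), weights=weights)[0]
        degrees[chosen] += 1  # the chosen vertex gains one child
        degrees.append(1)     # the newborn has a single edge, to its parent
    return degrees


if __name__ == "__main__":
    n, beta = 2000, 0.5
    degs = simulate_degrees(n, beta)
    # Rescaled root degree; by (1) this should stabilise as n grows.
    print(degs[0] / n ** (1.0 / (2.0 + beta)))
```

Each step recomputes all weights, so the sketch runs in quadratic time; that is acceptable for a few thousand vertices but not meant for large experiments.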

3 Main results 

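Our goal is to describe the almost sure behavior and the limiting distribution
of the degree of a fixed vertex $x$ in $G_n$ as $n \to \infty$.

For the root, the moments of the limiting distribution are calculated in
[6]. It is proven that

$$\frac{\deg(\emptyset, G_n)}{n^{1/(2+\beta)}} \to \zeta_\emptyset \tag{1}$$

as $n \to \infty$, with probability 1, with a positive random variable
$\zeta_\emptyset$. Moreover, the following holds for every integer $k \ge 1$.

$$E\zeta_\emptyset^k = \frac{\Gamma\left(1 + \frac{\beta}{2+\beta}\right)}{\Gamma\left(1 + \frac{k+\beta}{2+\beta}\right)}\, k! \binom{k+\beta}{k} = \frac{\Gamma\left(1 + \frac{\beta}{2+\beta}\right) \Gamma(k + \beta + 1)}{\Gamma(\beta + 1)\, \Gamma\left(1 + \frac{k+\beta}{2+\beta}\right)}. \tag{2}$$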

Our main result is the following. 
Theorem 1 Let $k \in \mathbb{Z}_+$ and $x = (x_1, \dots, x_k) \in \mathcal{N}$
be fixed. Then

$$\frac{\deg(x, G_n)}{n^{1/(2+\beta)}} \to \zeta_x$$

with probability 1, with some positive random variable $\zeta_x$. The
distribution of $\zeta_x$ is the same as the distribution of
$\zeta_\emptyset \cdot \xi_1 \cdot \ldots \cdot \xi_k$, where

• $\zeta_\emptyset, \xi_1, \dots, \xi_k$ are independent random variables;

• $\zeta_\emptyset$ is defined by equation (1);

• $\xi_1$ has distribution $\mathrm{Beta}(1 + \beta, x_1 - 1)$ if $x_1 > 1$; $\xi_1 = 1$ if $x_1 = 1$;

• $\xi_s$ has distribution $\mathrm{Beta}(1 + \beta, x_s)$ for $2 \le s \le k$.

Proof. We prove the theorem by induction on $k$.

For $k = 1$, let $j \in \mathbb{N}$ be fixed. Vertex $x = j$ is the $j$th child
of the root.

For $j = 1$ we have one edge, and for symmetry reasons it is clear that
$\zeta_1$ and $\zeta_\emptyset$ are identically distributed.

For $j > 1$, assume that vertex $j$ appears in the $N$th step, that is,
$j \in G_N \setminus G_{N-1}$. $N$ is a random positive integer. After the
birth of vertex $j$, we divide the weight of the vertices into two parts, a
"black" and a "white" one. $W_n(x)$ and $B_n(x)$ denote the "white" and "black"
weight of vertex $x$ in $G_n$, respectively, for $n \ge N$. The total weight of
a vertex is equal to its actual degree plus $\beta$, thus we have

$$\deg(x, G_n) + \beta = W_n(x) + B_n(x) \qquad (n \ge N,\ x \in G_n). \tag{3}$$

We set

$$W_N(j) = 1 + \beta, \qquad B_N(j) = 0;$$

$$W_N(\emptyset) = 1 + \beta, \qquad B_N(\emptyset) = j - 1;$$

$$W_N(x) = 0, \qquad B_N(x) = \deg(x, G_N) + \beta \qquad (x \in G_N \setminus \{\emptyset, j\}).$$

This is possible, because vertex $j$ has weight $1 + \beta$ when it is born, and
the root has degree $j$ at the same time. All the other weights are colored
black.

Later on, when a new vertex appears, it gets black weight $1 + \beta$, so its
total weight is black. When an old vertex, $x$, gets a new edge, its weight
increases by 1. The color of this increment will be randomly chosen; the
probability that the increment is white is the ratio of the white part to the
total weight of $x$. We formulate this in the following way. For $n \ge N$,
$k \in \mathbb{N}$, $x = (x_1, \dots, x_k) \in G_n$ let $d = \deg(x, G_n)$ if
$x \ne \emptyset$, and $d = 1 + \deg(x, G_n)$ if $x = \emptyset$. We set

$$P\left(G_{n+1} = G_n \cup \{(x_1, \dots, x_k, d)\},\ W_{n+1}(x) = W_n(x) + 1 \mid G_n\right) = \frac{d + \beta}{S_n} \cdot \frac{W_n(x)}{d + \beta} = \frac{W_n(x)}{S_n};$$

$$P\left(G_{n+1} = G_n \cup \{(x_1, \dots, x_k, d)\},\ B_{n+1}(x) = B_n(x) + 1 \mid G_n\right) = \frac{d + \beta}{S_n} \cdot \frac{B_n(x)}{d + \beta} = \frac{B_n(x)}{S_n}. \tag{4}$$

Of course, if $W_{n+1}(x) = W_n(x) + 1$, then $B_{n+1}(x) = B_n(x)$, and if
$B_{n+1}(x) = B_n(x) + 1$, then $W_{n+1}(x) = W_n(x)$; the total weight is
increased by 1 in both cases.

Note that vertices, except the root and its $j$th child, never get white
weight, which implies that

$$W_n(x) = 0, \qquad B_n(x) = \deg(x, G_n) + \beta \qquad (n \ge N,\ x \in G_n \setminus \{\emptyset, j\}).$$

On the other hand, the total weight of vertex $j$ is white when it is born,
thus we have

$$W_n(j) = \deg(j, G_n) + \beta, \qquad B_n(j) = 0 \qquad (n \ge N). \tag{5}$$

We colored the weights in such a way that

$$W_N(j) = W_N(\emptyset) = 1 + \beta.$$



From formula (4) it follows that the probability that the white weight of
a vertex increases depends only on its actual white weight, independently of
the structure of $G_n$, and $S_n$ is deterministic. Thus, for symmetry reasons,
we have that if

$$\frac{W_n(\emptyset)}{n^{1/(2+\beta)}} \to \eta_\emptyset$$

almost surely for some positive random variable $\eta_\emptyset$, then

$$\frac{W_n(j)}{n^{1/(2+\beta)}} \to \eta_j$$

almost surely as well, where $\eta_\emptyset$ and $\eta_j$ are identically
distributed. Thus we will focus on the behavior of the root.

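Recall that $W_N(\emptyset) = 1 + \beta$, $B_N(\emptyset) = j - 1$. From
equation (4) it follows that

$$P\left(W_{n+1}(\emptyset) = W_n(\emptyset) + 1 \mid \deg(\emptyset, G_{n+1}) = \deg(\emptyset, G_n) + 1\right) = \frac{W_n(\emptyset)}{W_n(\emptyset) + B_n(\emptyset)};$$

$$P\left(B_{n+1}(\emptyset) = B_n(\emptyset) + 1 \mid \deg(\emptyset, G_{n+1}) = \deg(\emptyset, G_n) + 1\right) = \frac{B_n(\emptyset)}{W_n(\emptyset) + B_n(\emptyset)}. \tag{6}$$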



This corresponds to a Polya-Eggenberger urn model. We have an urn
with $a$ white and $b$ black balls. At each step, we draw a ball with uniform
distribution, and put it back together with $c$ balls of the same color. It is
well known (see e.g. [3]) that the proportion of the white balls converges
almost surely to a $\mathrm{Beta}(a/c, b/c)$ distributed random variable.
Moreover, if the numbers of balls are replaced by "white" and "black"
weights, this remains true for positive, not necessarily integer weights.
In this case, at each step, we choose a random color. The probability that
white is chosen is equal to the actual proportion of the white weight in the
urn. Then the weight of the selected color is increased by $c$. As equations
(6) show, this happens at the steps when the root gives birth to a child.
Thus we can apply these results with $a = 1 + \beta$, $b = j - 1$ and $c = 1$,
and we get that

$$\frac{W_n(\emptyset)}{W_n(\emptyset) + B_n(\emptyset)} \to \xi_j$$

almost surely as $n \to \infty$, and $\xi_j$ has distribution
$\mathrm{Beta}(1 + \beta, j - 1)$.
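The urn with real-valued weights can be explored numerically with a short sketch. It is only a sanity check under stated assumptions: the function names and the parameters `steps`, `runs` and `seed` are hypothetical, and the check relies on two standard facts, namely that the white proportion is a martingale and that a $\mathrm{Beta}(a/c, b/c)$ variable has mean $a/(a+b)$.

```python
import random


def polya_white_fraction(a, b, c, steps, rng):
    """Run a Polya-Eggenberger urn with positive real weights.

    Starts with white weight a and black weight b; at each step a color is
    chosen with probability equal to its share of the total weight, and the
    weight of the chosen color grows by c. Returns the final white share.
    """
    white, black = float(a), float(b)
    for _ in range(steps):
        if rng.random() < white / (white + black):
            white += c
        else:
            black += c
    return white / (white + black)


def mean_white_fraction(a, b, c=1.0, steps=500, runs=2000, seed=1):
    """Average the final white share over independent runs of the urn."""
    rng = random.Random(seed)
    total = sum(polya_white_fraction(a, b, c, steps, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    beta, j = 0.5, 3
    a, b = 1.0 + beta, j - 1.0  # the weights used for the root in the proof
    print(mean_white_fraction(a, b), a / (a + b))
```

The two printed numbers should be close; comparing a histogram of the final shares with the Beta density would be a finer check, at the cost of an extra dependency.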

From this result, equations (1), (3) and the condition $\beta > -1$ it follows
that

$$\frac{W_n(\emptyset)}{n^{1/(2+\beta)}} = \frac{W_n(\emptyset)}{W_n(\emptyset) + B_n(\emptyset)} \cdot \frac{\deg(\emptyset, G_n) + \beta}{n^{1/(2+\beta)}} \to \xi_j \cdot \zeta_\emptyset = \eta_\emptyset$$

almost surely as $n \to \infty$. $\xi_j$ and $\zeta_\emptyset$ are independent,
because the almost sure convergence of the proportion of the white weight does
not depend on the behavior of the total weight of the root. As we have seen
before, this implies that

$$\frac{W_n(j)}{n^{1/(2+\beta)}} \to \eta_j$$

almost surely as $n \to \infty$, where $\eta_\emptyset$ and $\eta_j$ are
identically distributed. Thus, using equation (5) we get that

$$\frac{\deg(j, G_n)}{n^{1/(2+\beta)}} = \frac{W_n(j) - \beta}{n^{1/(2+\beta)}} \to \eta_j$$

almost surely as $n \to \infty$, and $\eta_j$ has the same distribution as
$\zeta_\emptyset \cdot \xi_j$, where $\zeta_\emptyset$ is defined by equation
(1), $\xi_j$ has distribution $\mathrm{Beta}(1 + \beta, j - 1)$, and finally,
$\zeta_\emptyset$ and $\xi_j$ are independent. This completes the proof for the
case $k = 1$.

Assume that the statement is proven for some $k \ge 1$. We fix
$x = (x_1, \dots, x_k) \in \mathcal{N}$ and $x_{k+1} \in \mathbb{Z}_+$. The
induction step can be verified by a slight modification of the previous
argument. The only difference is that $x$ has degree $x_{k+1}$ when its
$x_{k+1}$th child is born, namely, one edge is attached to its parent,
$(x_1, \dots, x_{k-1})$, and $x_{k+1} - 1$ to its already existing children.
This means that $a = 1 + \beta$ and $b = x_{k+1}$ in the urn model, and the
last factor $\xi_{k+1}$ has distribution $\mathrm{Beta}(1 + \beta, x_{k+1})$. □



4 A particular case and a variant 

4.1 Albert-Barabasi tree

For $\beta = 0$ the probability that a given vertex of degree $d$ gets the new
edge is proportional to $d$. This is the Albert-Barabasi random tree [1]. In
this special case it is possible to determine the distribution of
$\zeta_\emptyset$. Namely, $2^{-1/2}\zeta_\emptyset$ is identically distributed
with the absolute value of a standard normal random variable. To verify this,
one can get the even moments of $2^{-1/2}\zeta_\emptyset$ from equation (2),
namely,

$$E\left(2^{-1/2}\zeta_\emptyset\right)^{2k} = \frac{(2k)!}{2^k\, k!} = (2k - 1)!!,$$

which are just the even moments of the standard normal distribution.
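The moment identity can be checked numerically with a small sketch. It assumes that equation (2) reads $E\zeta_\emptyset^k = \Gamma\left(1 + \frac{\beta}{2+\beta}\right)\Gamma(k+\beta+1) \big/ \left[\Gamma(\beta+1)\,\Gamma\left(1 + \frac{k+\beta}{2+\beta}\right)\right]$; the function names `root_moment` and `normal_even_moment` are hypothetical.

```python
from math import gamma


def root_moment(k, beta):
    """k-th moment of the root limit, under the assumed form of equation (2)."""
    r = 2.0 + beta
    numerator = gamma(1.0 + beta / r) * gamma(k + beta + 1.0)
    denominator = gamma(beta + 1.0) * gamma(1.0 + (k + beta) / r)
    return numerator / denominator


def normal_even_moment(m):
    """(2m - 1)!! = 1 * 3 * ... * (2m - 1), the 2m-th moment of N(0, 1)."""
    out = 1
    for i in range(1, 2 * m, 2):
        out *= i
    return out


if __name__ == "__main__":
    for m in range(1, 6):
        scaled = 2.0 ** (-m) * root_moment(2 * m, 0.0)  # E (zeta / sqrt 2)^(2m)
        print(m, scaled, normal_even_moment(m))
```

For each `m` the two printed values should agree up to floating-point rounding.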

4.2 Generalized PORT model 

Plane oriented random trees (PORT) are also rooted ordered trees. In this
case the tree is embedded in the plane, and the left-to-right order of the
children of the vertices is relevant (see e.g. [3]). The out-degree of vertex
$v$ in tree $G$ is the number of its children, and it will be denoted by
$\deg^+(v, G)$. It is equal to the degree of the vertex minus one, except for
the root, where it is the same as the degree.

If a vertex $v$ has out-degree $d$, then there are $d + 1$ possible ways to
attach a new vertex to $v$. We get plane oriented random trees if all these
possibilities are equally likely. Namely, the probability that a vertex of
out-degree $d$ gets the new edge is proportional to $d + 1$. In the generalized
PORT model, this probability is proportional to $d + \beta$ for some
$\beta > 0$ (see [7]).

We can get similar results to Theorem 1.

Theorem 2 Let $k \in \mathbb{Z}_+$ and $x = (x_1, \dots, x_k) \in \mathcal{N}$
be fixed. Then

$$\frac{\deg(x, G_n)}{n^{1/(1+\beta)}} \to \zeta_x$$

with probability 1, with some nondegenerate random variable $\zeta_x$. The
distribution of $\zeta_x$ is the same as the distribution of
$\zeta_\emptyset \cdot \xi_1 \cdot \ldots \cdot \xi_k$, where

• $\zeta_\emptyset, \xi_1, \dots, \xi_k$ are independent random variables;

• $\xi_s$ has distribution $\mathrm{Beta}(\beta, x_s)$ for $1 \le s \le k$.

After showing the existence of $\zeta_\emptyset$, the proof is a straightforward
modification of the proof of Theorem 1, therefore we omit it.

To verify that $\zeta_\emptyset$ exists, we apply the results of Gouet about
generalized Polya urn models [5]. We color the root white and all the other
vertices black. At the beginning, the root has weight $1 + \beta$, while the
only black vertex has weight $\beta$. When the root gives birth to a child, its
weight is increased by 1, and the black weight is increased by $\beta$. On the
other hand, if the new vertex does not attach to the root, then the white
weight does not change, while the black is increased by $1 + \beta$. Thus we
have the matrix

$$A = \begin{pmatrix} 1 & \beta \\ 0 & 1 + \beta \end{pmatrix}.$$

Recall that $\beta > 0$. It is easy to check that all assumptions of Proposition
2.2 in [5] hold, except that $\beta$ is not necessarily an integer. This is also
easy to handle, and we obtain for $W_n$, the white weight after $n$ steps, that

$$\frac{W_n}{n^{1/(1+\beta)}} \to Z$$

almost surely, where $Z$ is a nondegenerate random variable. This guarantees
the existence of $\zeta_\emptyset$.

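The moments of $\zeta_\emptyset$ can also be determined by the method applied in
[6]. Fix a positive integer $k$, let $X_n$ denote the degree of the root after
$n$ steps, and let

$$c_n = \prod_{i=1}^{n-1} \left(1 + \frac{k}{(1+\beta)i - 1}\right)^{-1} \sim \frac{\Gamma\left(\frac{k+\beta}{1+\beta}\right)}{\Gamma\left(\frac{\beta}{1+\beta}\right)}\, n^{-k/(1+\beta)}.$$

One can verify that

$$c_n \binom{X_n + k + \beta - 1}{k}$$

is a nonnegative martingale; it is convergent almost surely, and bounded in
$L_p$ for every $p \ge 1$. Hence we obtain by further calculations that

$$E\zeta_\emptyset^k = \frac{\Gamma(k + \beta)\, \Gamma\left(\frac{\beta}{1+\beta}\right)}{\Gamma\left(\frac{k+\beta}{1+\beta}\right) \Gamma(\beta)}.$$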

For $\beta = 1$ we get the PORT model. In this special case $\zeta_\emptyset$ is
identically distributed with $2\xi^{1/2}$, where $\xi$ has exponential
distribution with expectation 1.
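The special case $\beta = 1$ can be cross-checked with a short numerical sketch. It assumes that the moment formula for the generalized PORT model reads $E\zeta_\emptyset^k = \Gamma(k+\beta)\,\Gamma\left(\frac{\beta}{1+\beta}\right) \big/ \left[\Gamma\left(\frac{k+\beta}{1+\beta}\right)\Gamma(\beta)\right]$, and it uses the standard fact that $E\xi^{s} = \Gamma(1+s)$ for an exponential variable $\xi$ with expectation 1. The function names are hypothetical.

```python
from math import gamma


def port_root_moment(k, beta):
    """k-th moment of the root limit in the generalized PORT model,
    under the assumed form of the moment formula stated in the lead-in."""
    numerator = gamma(k + beta) * gamma(beta / (1.0 + beta))
    denominator = gamma((k + beta) / (1.0 + beta)) * gamma(beta)
    return numerator / denominator


def scaled_sqrt_exp_moment(k):
    """E[(2 * sqrt(xi))^k] for xi exponential with mean 1: 2^k * Gamma(1 + k/2)."""
    return 2.0 ** k * gamma(1.0 + k / 2.0)


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, port_root_moment(k, 1.0), scaled_sqrt_exp_moment(k))
```

Agreement of the two columns for every `k` is what the duplication formula for the gamma function predicts.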